eXtended Cumulated Gain Metrics for the Evaluation of Content-oriented XML Retrieval

Author

  • Gabriella Kazai
Abstract

We propose and evaluate a family of metrics, the eXtended Cumulated Gain (XCG) metrics, for the evaluation of content-oriented XML retrieval approaches. Our aim is to provide an evaluation framework that takes into account the dependency that exists among XML document components and, in particular, incorporates mechanisms to reward the retrieval of so-called near-misses and to address issues of overlap. Both system-oriented and user-oriented evaluation aspects are considered, and both recall-like and precision-like qualities are measured. A further requirement is that the metrics be flexible enough to allow different models of user behaviour to be instantiated within them. We evaluate the proposed XCG metrics on the INEX test collection with respect to both their fidelity and their reliability. For example, the effects of assessment variation and topic set size on evaluation stability are investigated, and upper and lower bounds of expected error rates are established. The evaluation demonstrates that the proposed XCG metrics are reliable, stable measures, and in particular that the novel metric, effort-precision and gain-recall (ep/gr), shows behaviour comparable to established measures such as precision and recall.


Similar resources

Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines

Purpose: Traditionally, there have been many metrics for evaluating search engines; nevertheless, various researchers have proposed new metrics in recent years. Awareness of these new metrics is essential for conducting research on search engine evaluation. The purpose of this study was therefore to provide an analysis of important and new metrics for evaluating search engines. Methodology: This is ...


Proceedings of the SIGIR 2007 Workshop on Focused Retrieval

Determining the effectiveness of XML retrieval systems is crucial for improving information retrieval from XML document collections. Traditional effectiveness measures do not address the problem of overlap in the recall-base. At the Initiative for the Evaluation of XML retrieval (INEX), extended cumulated gain (XCG) was developed to address overlap. It works by comparing the cumulated score of ...


Overview of the Initiative for the Evaluation of XML retrieval (INEX) 2002

The INitiative for the Evaluation of XML retrieval (INEX) aims at providing an infrastructure for evaluating the effectiveness of content-oriented XML retrieval. In the first round of INEX, in 2002, a test collection of real-world XML documents, along with standard topics and respective relevance assessments, was created. Research groups from 36 different organisations participated in this c...


Tolerance to Irrelevance: A User-effort Oriented Evaluation of Retrieval Systems without Predefined Retrieval Unit

Video and XML retrieval test collections call for evaluation metrics that do not require a predefined retrieval unit. The use of traditional recall and precision metrics is problematic due to issues caused by ‘overlap’ between result and reference items. The paper proposes evaluation metrics derived from a user-effort oriented view of information retrieval to address these problems. It builds o...


Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML

With the move to a big data world, where unstructured data is growing at high velocity, there is a need for a data store that can hold this volume of data. Traditionally, relational databases have been used, but they are not well suited to handling such large amounts of data, so it is necessary to move to non-relational data stores. In the current study, we have proposed an extension of the Mongo...



Publication date: 2006